Laurence B. Heilprin

Abstract: The literature of knowledge is a "very large system" in the
cybernetic sense of intractability to control. Improving access to it
needs some simplifying theory. A step in this direction is a
hypothesis constructed from a small number of basic concepts. These
include cybernetic concepts of variety and requisite variety; a version
of the mathematical concept of homomorphic mapping; and
information-scientific concepts: an invariant 3-segmented IS path, and
short and long duration (SD and LD) modes of message propagation. Since
all disciplines are symbiotic, defining a distinct IS domain is purely
pragmatic. However, the IS concepts do define a domain, which acts
as a reference frame convenient for locating the substructures necessary
and sufficient for cognitive access to literature.

Cognition is visualized as two main processes: data gathering
or acquisition of sensory variety, and data processing or homomorphic
"abstracting". Understanding is suggested as pattern recognition of
the spectrum of abstractions into which sensory experience is decomposed.
A simple model is given for a treelike (minimum class extension)
"abstract-level" structure. It is based on discrete quanta of variety
(stored direct sense impressions, SSI's) and "natural" associative
processes, to which access is had by Pavlovian "conditionally" associated
stimuli (symbols). A model of information search of an LD collection
incorporates these basic processes: sensing, pattern recognition of
prior abstractions, and alternation of two homomorphic mappings,
decompression and selection (decision making on surrogate or image
collections). Some familiar IS devices are interpreted as regulators of
variety and its flow. The most critical processes in access occur in
our minds, not in data files, libraries or computers. Access to
knowledge requires completing an IS path: connecting two minds across a
variable physical segment.

The special problems of access to the literature of the social
sciences and the humanities are chiefly those of small classes with
large variety to overcome. Certain variety-suppressing devices such
as thesauri should be particularly helpful at this stage. However,
there is a large, long-term cost ahead for the disciplines and professions
concerned.

On Access to Knowledge in the Social Sciences and Humanities, From
the Viewpoint of Cybernetics and Information Science

Laurence B. Heilprin 
School of Library and Information Services 
and 

Computer Science Center, University of Maryland 



The subject of this conference, access to knowledge in the
social sciences and humanities, forms part of the larger subject,
access to knowledge. This in turn is part of a still larger
subject, human mind-to-mind communication. The last is an immense
field which touches and borrows from every science. Nevertheless
a central core of what we are after is found in information science
or "informatics" (IS), and in cybernetics (C), a somewhat more
abstract discipline which goes beyond human communication to include
communication in general, animal or machine. The two sciences may
be roughly distinguished by the fact that in cybernetics
communication is the means to regulate and control systems of any kind,
whereas in information science communication is the end in itself,
and is chiefly confined to human mind-to-mind communication.

The reason for including our subject within broader classes
is simple. Human thinking is performed in terms of classes, and
a system of any kind is simply a class located within a larger
class, its environment. The processes of the system may be
divided into what have been called its "internal transformations",
and those processes which cross its boundaries and relate it to
its environment. Therefore a theory of access to knowledge in the
social sciences and humanities includes two kinds of relations:
those within the subject, and those relating the subject to its
immediate environment, knowledge, and to the larger environment,
mind-mind communication. Since our everyday experience tends to
make us more familiar with internal than with internal-external
relationships, the first and larger part of this paper is about
the latter. We "get to the point" only toward the end. However,
I see no way to avoid the long introduction. In reality it is far
too short, for it attempts a wide integrative or interpretative view
based on many sciences.

I. The Domain of Information Science

C has such a broad domain, regulation and control, that it
subtends all human activity. However, other fields such as physics,
biology, and the social and behavioural sciences are equally broad.
None are independent, nor possible to isolate in the universe of
knowledge. Therefore to "define" IS can mean, realistically, no more
than that it too can present certain aspects of this universe in
useful, distinct cross-sectional view.

Since we aim at basic concepts, let us keep their number small.
In C we use two concepts: variety, and the law of requisite variety.
The latter is described later, so we begin with variety. This concept
has gained a central place not only in C but in biology, psychology,
and other fields. As will be seen, it is a broad "integrating concept".
Variety is a property of a set, not of an individual. It is simply
the number of discriminable differences which an observer can make in
observing some system. Since the discrimination is made by the observer,
the variety in a set may be more (or less) for one observer (or
discriminating system) than for another. Variety is measured either as
the number N of distinct discriminations, or as the logarithm of this
number, log_A N, where A is some arbitrary base (usually 2). The variety
in the following letter:

Dear Dad: Please send money. Love. Your son, 

is

1 (or log_2 1 = 0) if the set is the message as a whole, or the
unit set;

8 (or log_2 8 = 3) for the set of words in the letter;

12 (or log_2 12 = 3.58) for the set of letters in the message,

assuming we all count and discriminate identically.
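
The counting just described can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the function name
variety and the three "observer" rules (whole message, words with
punctuation ignored, letters with case ignored) are choices made here,
not part of the paper. Because variety depends on the observer's rule of
discrimination, the letter count this sketch produces for the text as
transcribed need not agree with the figure of 12 quoted above.

```python
import math
import re

def variety(items):
    """Return (N, log2 N) for the distinct elements an observer discriminates."""
    n = len(set(items))
    return n, math.log2(n)

letter = "Dear Dad: Please send money. Love. Your son,"

# Observer 1: the message as a whole is the only element (the unit set).
whole = variety([letter])

# Observer 2: discriminates words, ignoring punctuation (assumed rule).
words = variety(re.findall(r"[A-Za-z]+", letter))

# Observer 3: discriminates letters, ignoring case (assumed rule).
letters = variety(ch.lower() for ch in letter if ch.isalpha())

print(whole)    # (1, 0.0)
print(words)    # (8, 3.0)
print(letters)  # depends on the discrimination rule chosen above
```

Changing the rule in observer 3 (for example, treating upper and lower
case as different) changes the result, which is the paper's point that
variety belongs to the set as seen by a particular observer.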

This concept will grow in usefulness as we go. We now introduce
at greater length some proposed basic concepts for IS. These are:
3-segmented IS path, SD and LD modes of message propagation; and
later, certain embryonic models such as a minimum abstraction tree
model, and a rate model for information search. Some of this has
appeared.

Figure 1 is a diagram representing schematically a one-way
path of message propagation from an individual A to an individual B.

The over-all path consists of three segments, each composed of many stages.
Segment ab is in individual A, the author or sender of the message. Segment bc
is in the environment or medium [or media] surrounding A and B. Segment cd
is in individual B, the recipient of the message. The entire path is physical in
the sense that all of its stages are subject to the laws of physics. We will call bc
the external path segment (also "external segment," "external path," or
"physical link"), ab the organic efferent path segment, cd the organic afferent
path segment. Individuals A and B possess many peripheral sense organs and
motor organs, but these are represented in each by a single effector or motor
organ M, and by a single sensor or sensory organ S. The organ M is used to
modulate suitable physical systems used as message carriers through intervening
medium bc. The organ S is used for reception and transduction of the
modulation within the same sense channel. However, it is not necessary that both M
and S correspond to the same sense channel, provided suitable translators and
transducers exist. Each individual, A or B, is provided with a nervous system
that conducts afferent modulation from peripheral sense organ S to central
region C, which we call simply the "mind." It also conducts efferent modulation
from C or a region near C to peripheral motor organ M. The exact locations
of M, S, and C within the body boundaries are not material to our picture.
But the fact that part of the path of propagation lies in each body is essential.

It will be noticed that what the communication engineer usually thinks of
as the path of propagation is external path bc. The stages of this part of the
path extend from the boundary of the message sender through possibly many
media to the boundary of the message recipient. These stages may have widely
differing conditions of propagation. They may include natural media such as
air, water, and solids; or man-made media such as transmitters, receivers, and
information-storage and -retrieval systems. Segment bc may include organic,
possibly living structures. But in general, the media outside the boundaries of
A and B are purely physical and "non-semantic," in the sense that they do not
contain the special message-initiating and message-receiving equipment located in
communicating organisms. Thus, relative to A and B, bc is simply a set of
stages not including the bodies of A and B, through which the modulation passes
without the special processing that occurs in nervous systems (later referred to as
"association"). In bc, modulation remains invariant if it is propagated in the
absence of noise. We assume the range of propagation to be such that the power
level remains high enough, and distortion low enough, for complete
discrimination by B of the modulation encoded by A. The message may be amplified,
regenerated, and transduced many times. But it remains simply modulation,
physically transducible into itself in the form in which it left A's modulating
organ. With noise, the modulation deteriorates according to the second law
of thermodynamics, and its "bit content" of information deteriorates according
to the mathematical theory of communication. In many ways the purely
physical modulated carrier in path segment bc is the simplest form assumed by a
message as it passes from A to B.

Our first task will be to sketch a theory showing what communication is:
how a message originates in a biopsychological medium ab, passes through
a purely physical medium bc, re-enters a biopsychological medium cd, and
conveys meaning from mental terminal C(A) to terminal C(B).

One obvious fact is that the stages in segments ab and cd are the only
structural invariants in the path. The organic structures in ab and cd are built-in,
fixed. Neural messages travel more or less fixed paths, paths (with almost no
exceptions) beyond the power of either communicant to alter. By contrast the
external path bc or "physical link" is almost infinitely variable by man. Because
of this, what occurs in organic segments ab and cd is more characteristic of the
message than what occurs in segment bc. The terminal segments are the only
parts of the path in which meaning is encoded and decoded.

A second point is that the human one-way communication path resembles
a simple control circuit. In Fig. 2(a), as in Fig. 1, a message is being sent by
A to B. When the modulated signal leaves M(A), A has little control over it
unless he monitors what he sends. If he is speaking, he obtains feedback through
S(A), and adjusts his voice. If he is writing, he controls his message by visual
feedback. This feedback control of the modulated efferent signal is shown at
F(A).

A third point is that in operation the human one-way communication path
resembles a rectifier. If B decodes A's message and encodes a reply, the reply
does not return through B by the same path as that of the incident message.
The message to B travels afferent path S(B)C(B). The message from B travels
efferent path C(B)M(B), with external return feedback through F(B) [Fig. 2(b)].
For one-way mind-to-mind communication an afferent path in B must be
coupled to an efferent path in A, across the medium. For two-way
communication the coupling is both M(A)S(B) and M(B)S(A). A series of messages and
replies (or conversation) produces an intermittent unidirectional "current" of
modulation in a closed path. Terminals C(A) and C(B) function alternately
as modulation "generators" and "recorders," with a time lag or lead. Thus
human communication resembles vacuum tubes and solid-state devices, in that
it exhibits a rectifying action. The complete path ad, the basic unit of human
one-way communication when two persons are MS coupled, might be called
an "information rectifier" or a "modulation rectifier" [Fig. 2(a)].

When two paths ad and da are coupled for two-way communication [Fig. 2(b)],
the two modulation rectifiers coupled in series (MS and MS) form the simplest
complete system of human communication. As the message is encoded,
sensory feedback at F(A) and F(B) monitors its alternating physical transduction.
The message itself acts as a higher or "semantic" control feedback, altering
and controlling the meaning encoded in each reply.

Before generalizing this model, let us sketch in one more
feature:

All messages may be divided into two classes: those
of short duration (SD) and those of long duration (LD). Of
these, SD are the more basic in the sense that we could
communicate with only SD messages but not with only LD messages.
A message in its simplest form consists of two components.
The first is some physical system, which we will call a
"carrier," that is not in itself a message (examples: radio
waves without the voice or music "intelligence"; a blank
sheet of paper). The second consists in discriminable marks on
the carrier such as images or sounds (SD) or printed letters or
drawings (LD) which we will call "modulation". The carrier
can exist without modulation but not the reverse. In SD
messages, the modulation varies in time. The marks on sound
waves in direct speech change constantly at the ear; in fact
they must die away (be attenuated) rapidly, if we are to be
able to discriminate the next words or musical notes. If
they persisted for even a short time more than they do, the
sounds of successive speech or music symbols would become
indistinct and blurred. Reverberation would destroy the meaning
conveyed by the modulation. It is of the essence that SD
messages be attenuated at least as fast as they pass into the
sensor of a human recipient or of a machine-receiver. Somewhat
more formally: the attenuation rate of the channel which
conveys information to the sensor must equal or exceed the
information rate of the sensor. By information rate is meant
the time rate of change of fully discriminable "least units"
of information such as word-symbols, or of their components,
such as "bits". Unless we refer specifically to bits/second
or other rate units, by "sensing rate" we shall mean "words/second".
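
The rate condition stated above can be written as a one-line check. The
sketch below is a deliberately crude illustration, not a model from the
paper: it assumes that each symbol lingers in the channel for a fixed
persistence time, and takes the attenuation rate to be the reciprocal of
that time. The function name sd_discriminable and both parameters are
hypothetical.

```python
def sd_discriminable(persistence_s, sensing_rate):
    """True if successive SD symbols stay distinct at the sensor.

    persistence_s: assumed time (seconds) a symbol's modulation lingers.
    sensing_rate:  symbols (e.g. words) per second arriving at the sensor.
    """
    attenuation_rate = 1.0 / persistence_s  # symbols "cleared" per second
    # The paper's condition: attenuation rate >= information rate.
    return attenuation_rate >= sensing_rate

print(sd_discriminable(0.2, 3.0))  # True: each word dies away before the next
print(sd_discriminable(1.0, 3.0))  # False: reverberation blurs successive words
```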

The reason why SD messages are more basic than LD is
simply that when the message passes into the sensor of man or
machine it must do so in SD form. Human sensory (afferent) and
motor (efferent) messages travel by time-varying modulation to
and from their destination or source, usually the brain. The
same is true of machines which pass information through a sensor
into some "decision" mechanism.

In contrast with that of SD messages, the modulation of 
LD messages persists for comparatively enormous time intervals. 
In order to achieve this extension into the time dimension, the 
carrier is restricted in most cases to a solid, and the modulation, 
instead of temporal, is spatial. Printed letters on a page store
their contained message for long periods. They do so by extending
spatially on the page. Naturally, since both SD and LD
modulation exist in space and time, both are four-dimensional
"marks." But the far shorter time duration of SD
modulation makes us refer to it as "purely temporal,"
which it is not; and to LD as "purely spatial," which
it is not.

Because of this constraint on communication, that a
message must enter the sensor in the SD mode, all
"stored" or LD messages must be convertible to SD form.
This is indeed the case. It is also true that many SD
messages are convertible to LD mode, but this was not
always so, and the conversion is man's peculiar
discovery. He found that information in LD mode can be
stored, i.e., propagated into the time dimension, even
beyond the life of the message sender. The discovery
enabled the cumulation of knowledge: the possibility for
man's finite brain to tap a much larger memory than his own.
Using this technique of storage as a tool, he erected
science and civilization. But it is a basic constraint on
use of the tool that stored messages be retransformed into
the SD mode.

The reason for introducing SD and LD modes is that they enable
us to define with some precision the domain of information science,
as distinct from that of other sciences. Like all definitions, this
one is only as sharp as the concepts (or the classes) used. Consider
Figure 1 as an overall model of three coupled segments, such that
the coupling between segments is always in the same relative spatial
and temporal order shown. Spatially, it does not matter where
recipient B is located, but B is always separated from A by an
"isolating" segment, bc. Temporally, A sends the message prior
to the time at which B receives it. Two general classes of human
communication now arise, depending on whether the message is
propagated solely in SD mode, or is transformed into LD mode at
some stage in segment bc, and then retransformed to SD mode for
terminal coupling with B. This is shown schematically in Fig. 3.
With SD-only propagation the main time delays tend to occur at the
terminals (source and sink, C; sender and recipient, M and S). This
involves a relatively tight bond. The recipient usually must 
identify the sender, or at least have a channel that connects back
to him. The path becomes a closed loop typical of the conversational
mode, and also of feedback regulation and control. The closed loop
predominates in all society, primitive and modern. Radio, telephone,
television and other rapid signal transmission systems tend to retain
the small time delays in the central segment, which are essential
for closed loop communication. This is the main domain of cybernetics,
for without control of the sink by the source possibility of regulation
is greatly reduced. On the other hand open loop communication is
the principal domain of information science. We now define it
roughly, in terms of the concepts introduced so far. The proposed
domain of information science is simply the set of all three-segment
message paths (i.e., initial and final segments are living
humans whose signal sending and receiving organs are appropriately
matched to the channel of the signal in the central segment, the
environment; and sender-environment coupling occurs prior to
environment-recipient coupling) such that the mode of signal
propagation is usually but not exclusively SD-LD-SD (or briefly, "SLS").
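
The distinction between the two classes of path can be expressed as a
small classifier. This is a sketch only, under the assumption that a
path is recorded as the ordered list of modes the message takes from
sender to recipient; the function name classify_path and the returned
labels are hypothetical, and the paper itself notes that the boundary is
"usually but not exclusively" SLS.

```python
def classify_path(modes):
    """Classify a message path by its propagation modes.

    modes: ordered sequence of "SD"/"LD" strings, from sender to recipient.
    """
    if not modes or any(m not in ("SD", "LD") for m in modes):
        raise ValueError("modes must be a non-empty sequence of 'SD'/'LD'")
    if modes[0] != "SD" or modes[-1] != "SD":
        # Effector output and sensor input are both time-varying (SD).
        raise ValueError("a path must begin and end in SD mode")
    if "LD" in modes:
        # Stored at some stage in segment bc: open loop, long delays possible.
        return "SLS (open loop, information science)"
    # Never stored: short delays, closed-loop conversation is possible.
    return "SD only (closed loop, cybernetics)"

print(classify_path(["SD"]))              # a spoken remark
print(classify_path(["SD", "LD", "SD"]))  # a written and later read book
```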

It will be noticed that, since SLS predominates on the "IS
path", the time interval between the two couplings of the three segments
may be very long. In fact, transduction at b may precede that at
c by years, centuries, millennia. The LD message acts as a variable
storage device and variable time delay which combines enormous
versatility. A third kind of versatility is the "serial addition"
of recipients, since the message may be "non-destructively" read
out of its LD carrier at various times; and the "multiplication"
of recipients through message replication. Thus the LD message
enhances man's capacity to carry a message in his mind, lengthens
its retention often beyond his lifetime, and extends the number
of recipients beyond his capacity to contact them in space and
time. If we regard the general goal of a message as "reaching a set
of recipients", then the LD mode greatly increases the alternatives
open to the sender for goal-fulfillment. The cybernetic "law of
requisite variety" is as follows.

Let E represent a set of "essential variables" which must be
kept within certain limits in order to achieve a goal. Let D be
a set of disturbances or threats to E which, by acting on E through
the environment T, can drive E's variables out of the region of
stability for attaining the goal. Finally, let R be a regulator
interposed to keep D from disturbing E by driving its essential
variables out of bounds. Then the law states that R can
successfully "regulate" D only if it has "requisite variety". That is, R
must have a variety of alternatives sufficient to counter the
variety of disruptive alternatives open to D. In stating and
presenting this central law of cybernetics Ashby represents the
contest between R and D as a two person "matrix game". The elements
of the matrix are the outcomes of the interaction of D and R on E,
acting through environment T. The game consists in two moves. D
plays first, by selecting a row of the matrix; R counters, by
selecting a column. If the outcome (value of the matrix element
jointly selected by D and R) is within the essential set of E
(E's region of stability for achieving its goal), then R wins by
"regulating" D. Otherwise R loses. Only variety in R can "drive
down" or "destroy" the variety in D.
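
The matrix game just described can be made concrete with a small
sketch. The outcome table, the essential set {"a"}, and the function
names can_regulate and min_outcome_variety are all illustrative
assumptions; the table is chosen so that no column repeats an outcome,
which is the situation in which R's variety is the only thing that can
reduce the variety of outcomes.

```python
from itertools import product

def can_regulate(outcomes, essential):
    """True if, for every row D may pick, R has a column landing in `essential`."""
    return all(any(cell in essential for cell in row) for row in outcomes)

def min_outcome_variety(outcomes):
    """Smallest number of distinct outcomes R can force, over all strategies.

    A strategy assigns one column (R's move) to each row (D's move).
    """
    n_cols = len(outcomes[0])
    best = None
    for strategy in product(range(n_cols), repeat=len(outcomes)):
        seen = {row[col] for row, col in zip(outcomes, strategy)}
        best = len(seen) if best is None else min(best, len(seen))
    return best

# Rows: D's three disturbances. Columns: R's three responses.
table = [["a", "b", "c"],
         ["b", "c", "a"],
         ["c", "a", "b"]]
essential = {"a"}

# R restricted to a single response (variety 1 instead of 3).
poor_table = [row[:1] for row in table]

print(can_regulate(table, essential))       # True
print(can_regulate(poor_table, essential))  # False
print(min_outcome_variety(table))           # 1
print(min_outcome_variety(poor_table))      # 3
```

With three responses R can hold the outcome at "a" whatever D does; with
one response the full variety of D passes through to the outcomes.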

By use of LD messages in addition to the basic and more
primitive SD, man has enormously increased his capability to
"regulate", i.e., reach, sets of recipients. The LD message, a
cornerstone of information science, may also be regarded as a
cybernetic control device. It offers man increased variety of
alternatives. The sender has the opportunity to "trade"
comparative certainty of designation of a small set of recipients
(usually known to him) using a closed loop, for uncertainty of
designation of a vastly larger set of (usually unknown)
recipients (open loop). Conversely, it offers the recipient a
"trade" between comparative certainty of access to the messages
of a small set of senders for uncertainty of access to those
of a vastly larger set of senders. Viewed from IS the LD message
is a space-time or four-dimensional switch; viewed from C it is
a regulator against the "disturbance" of communication, with much
more "requisite variety" for insuring that knowledge continue to
be communicated than has the SD message alone. The messages of
the arts and sciences represent communication with large sets of
undesignated or loosely designated recipients. For them the LD
mode, statistical from the sender's viewpoint, is universal.

Finally, it will be noticed that the domain of information
science has been defined in terms of phenomena accessible to
social (objective) observation in middle segment bc; and of
phenomena not as yet accessible, in segments ab and cd. We must
be careful, however, to avoid suggesting that the signal, a physical
pattern of free energy (energy differences) as it crosses segment
bc, is ever observable in bc. It is not. What occurs in bc must,
like the messages that originate in ab, be observed in, and only in,
segment cd. Another possible misconception is that this "purely"
physical signal "carries meaning" along with the pattern of energy
differences which the sensor discriminates. Meaning exists as
such only within humans. The locus of meaning in the universe
is a set of small regions like those at the ends of the "IS path".
Physical continuity of meaning in crossing gap bc of this path is
an illusion. If a meaning seems to be "transmitted" from sender
to recipient, what actually occurs is that the signal effects some
change in the recipient which he interprets as meaningful. The change
occurs in the interaction of the pattern of sensory variety with
something else, considered below. The act of meaningful
communication can be regarded from the viewpoint of IS as use of a signalling
device whereby a sender operates switches in the mind of the recipient.
From the viewpoint of C it can be regarded as another application
of the law of requisite variety.

The essential discontinuity between sender and recipient
(the environment that intervenes physically between them and is
represented by segment bc) introduces threatening variety: variety
which disturbs the basic need of individuals to act together, to
form society. The variety in the ways of keeping the sender and
recipient incommunicado is reduced and overcome by "taking advantage
of the constraints". These are, that it is physically possible to
propagate, not an entire sensory pattern, but a "homomorphism"
of it, from one brain to another. This important mathematical
and cybernetic concept has briefly been described in relation to
IS. We now try to follow the sensory pattern, beginning with
the direct sense impression (DSI) produced while a stimulus is
acting, and the stored sense impression (SSI) or trace left in
memory after the stimulus ceases, and beyond.

II. An IS Theory of Communication

We begin with a paradox. In segment bc what is propagated
is a physical pattern, "modulation". All our cognitive mental
contents either consist in such patterns or are derived from them.
We can divide these mental contents into those that we cannot
directly communicate (DSI's and SSI's) and those we can (something
else). The paradox is that the signals that cause our DSI's
travel, while, given a DSI, we cannot originate a signal that will
convey that DSI directly into another mind. I cannot transfer
into your mind the scene I see. The paradox disappears if we
recall the rectifier-like nature of the path of a one-way message.
The pattern of your DSI must originate outside you, in segment
bc. But it cannot originate farther back, in my segment ab.
However, since we do communicate something all the way from a to
d, then that something cannot be a pattern of one of my DSI's.
Rather it is a subpattern, derived from DSI's and SSI's that
entered me but which I cannot transmit out again. In other words,
we cannot communicate the full patterns present in DSI's and SSI's,
but only partial patterns extracted from them.

A huge literature exists on the subject. It is not known,
however, exactly how the brain extracts subpatterns from the full
patterns. It is known that, as the DSI's are stored as SSI's,
the brain associates them, possibly by the growth of bonds at the
sites of the memory traces. For purposes of information science we
distinguish two main kinds of association, "natural" and
"conditioned". Natural association is less studied in psychology
than conditioned. It is less controllable. It is responsible for
the spontaneous (without our conscious control) formation of our
concepts of objects, for our "reconstruction" of the environment.
Conditioned (Pavlovian) association can be under conscious control.
It occurs when we associate an arbitrary pattern (for example,
a visual word pattern, or auditory sound pattern) with a natural
pattern. That is, we "attach" the arbitrary pattern (symbol) to
the naturally formed pattern (concept). Since the bond of
conditioned association to a natural pattern can be made as tight
as we choose, the formation of the symbol as a DSI in the
mind of a recipient will actuate or evoke the natural pattern
into his awareness. (It can also be recalled or evoked by a
DSI which originally formed and is part of the natural pattern,
e.g., by suddenly seeing a scene, or hearing a sound.) In this
way the sender of a message can send down his path (ab) a partial
pattern, namely, that subpattern of a DSI necessary to form a
symbol. The symbol is transduced at b into the apparently full
pattern which eventually reaches the recipient at d, and evokes a
concept. The sender of a symbolic partial-pattern "plays" on the
set of stored concepts in the recipient's mind, evoking one after
the other, and in the process causes the recipient to experience
a message.

This sketches the process of symbolic connection: the use
of arbitrary but socially accepted symbolic patterns to evoke the
natural ones which our minds form spontaneously. The real mystery
remains: how the brain forms concepts (partial patterns) out
of complete sensory patterns in the SSI's.

III. Pattern Recognition and Abstraction 

It is probable that every DSI remains "nothing but" a physical
pattern until, perhaps, it reaches the cortex. In other words, there
is a stage before which a DSI is merely a pattern, and after which it
is "recognized", and in so doing "acquires meaning". The process of
recognition is preceded by another, called "feature extraction". It
is analogous to what Newton found occurs when a beam of white light is
passed through a prism. A parallel beam of white light is decomposed
into diverging colored beams. If these colored beams are collected and
passed back in one direction, they recombine into white light. That
is, a stimulus (beam) which produces in us the perception of "white"
light is analysed by physical structure (prism) into components
that we then perceive as "colored" light. Analogously, when a DSI
is still "merely a pattern" on the cortex, it contains, within its
complete pattern of discriminable differences, simpler subpatterns
into which it can be decomposed. A subpattern of a fuller pattern
is a "feature", or "characteristic" or "property" or "internal
relation" of the fuller pattern. To perceive the feature requires
simplifying the full pattern: merging or erasing all but that which
displays the feature. Now since the extracted pattern is no longer
a full sensory display (the "concrete" kind which we associate with
"reality") the extracted subpattern will not appear to us like a
concrete DSI. It will contain one or more extracted features or
relations. We call it an "abstraction". It is well named. It
is literally abstracted from sets of SSI's which contain in their
patterns the subpattern or feature. It is less than fully pictorial,
and more relational, as indeed are our abstract concepts. That is,
it does not contain more relations than the original DSI, but more
relations relative to its reduced content. The brain receives and
stores perhaps millions of SSI's. From them are formed, in the
infant rapidly and adult more slowly, the partial-pattern
biophysical structures that will, once formed, decompose new DSI's
into abstract components, much like white light into colored
components. Learning or induction consists in establishing the
abstraction-structures. Deduction consists in using established
abstractions to interpret new DSI's (experience), and predict their
nature from their spectrum. When a new DSI pattern interacts with
a set of partial-pattern structures it is rapidly and reliably de- 
composed into the abstraction spectrum of its properties. This enables 
us to instantly understand something we perceive, and respond 
"intelligently". And only in terms of this spectrum is a DSI either 
understandable or describable with symbols. The spectrum of its 
properties comprise its meaning. In this way a DSI, a pattern, 
up to some stage without meaning, through decomposition into partial 
patterns instantly "acquires'^ a characteristic spectrum which identifies 
it and is its meaning. The response time, especially if the abstraction- 
structures are well established, is very short, and we are unaware of 
an interval betv;een '^neaninglessness " and "meaningfulness" . At present 
a whole new field of psychology is opening around the complexity and 
size of the abstraction structures as measured by response time. We 
are of course unaware of all the many interactions going on. We 
perceive the spectx'imi. not as distinct relations but as a composite of 
properties the *\neaning". Actually a meaning is^^set of partial- 
patterns or concepts. Philosophers have long been aware that "an 
object" is intellectually equivalent to a set of properties observed 
through its behavicur. 
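
The induction and deduction steps just described can be made concrete
with a small sketch. It is only an illustration under strong
assumptions: patterns are modelled as sets of discriminable elements,
a partial pattern is taken to be whatever a set of SSI's has in common,
and the function names and sample data are hypothetical, not taken from
the paper.

```python
# Hypothetical illustration: a pattern is a set of discriminable elements.

def extract_feature(ssi_set):
    """Induction: the subpattern shared by every stored impression in the set."""
    patterns = [frozenset(p) for p in ssi_set]
    return frozenset.intersection(*patterns)


def spectrum(dsi, abstractions):
    """Deduction: names of the established abstractions contained in a new DSI."""
    dsi = frozenset(dsi)
    return sorted(name for name, sub in abstractions.items() if sub <= dsi)


# Stored sense impressions (SSI's), grouped by natural association (assumed data).
stored = {
    "green": [{"hue:green", "shape:leaf", "size:small"},
              {"hue:green", "shape:ball", "size:large"}],
    "round": [{"hue:red", "shape:ball", "size:small"},
              {"hue:green", "shape:ball", "size:large"}],
}
abstractions = {name: extract_feature(ssis) for name, ssis in stored.items()}

# A new direct sense impression is decomposed into its spectrum of properties.
new_dsi = {"hue:green", "shape:ball", "size:medium"}
print(spectrum(new_dsi, abstractions))  # ['green', 'round']
```

In this toy reading the "meaning" of the new impression is simply the
list of partial patterns it activates; nothing in the sketch models
response time or the biophysical structures themselves.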

A great deal more could be said about this process, central 
to mental activity. Evidently it bears on human development:
the rate and kind of partial patterns that are set up in the brain
bound the meaning which at any time the person can attribute to any
DSI experienced. It bears on education: the more or less socially
planned, guided or aided establishment of partial patterns. As will
be seen later, it determines what the librarian and information scientist 
does in "organization of knowledge". But before discussing this it is 
necessary to review several other matters. 

The point has been made that only by means of the abstract 
properties into which it is mentally decomposed can a DSI acquire 
meaning. The same is true when we try to communicate it by composing
its meaning in another mind, or "describing" it. Since we cannot
project the DSI or any concept directly into the mind of another, we
string together some of its most salient properties (to us) and evoke
them in the recipient's mind with a corresponding string of symbols.
But this is the inverse of decomposition. In our own and the recipient's
mind it is composition. In description we use symbols to reconstruct
our own concept within the mind of another, much as we reconstruct
white light from previously decomposed colored components. Provided
the recipient also has the same set of abstractions, our symbols
evoke them in him and he "follows" rapidly and easily. Indeed,
if he happens to have a more detailed or finer set of abstractions,
he may by association perceive more meaning (subjective to him) than
does the sender. Any description, any communication of meaning,
requires at least a few shared abstractions, and is possible only
because their partial patterns have been formed in both communicants.
So in communication we invert the analytical process of feature
extraction, and synthesize the described concept. One of the prime
problems of science is to ensure that, in reconstituting any
concept through its spectrum, the natural order of DSI's observed
in a phenomenon is faithfully preserved in the concept transmitted by
symbols. "Operational" definitions attempt to retain "objectivity"
in scientific description by describing concepts only in terms of
those derived from actually or potentially performable operations.

"The concept is synonymous with the corresponding set of
operations". H 

Without such safeguards it is easy to synthesize in the mind
an apparently plausible but actually inconsistent mental construct.
The assumption that nature cannot be inconsistent--only those who
describe nature--is the basis for much of scientific method,
philosophy and logic. We now consider another aspect of abstraction.

IV. A Theory of Classes and a Simple Model of Minimum Class Structure

We have discriminated between natural association, which gives
rise to our concepts, and conditioned association, which gives rise to
the arbitrary sets of symbols which constitute languages, and by
means of which we evoke or trigger our concepts from their latent
state in memory to temporary activity in our awareness. The role
of symbols has been much more fully explored than has the role of
natural associations or concepts. The relations abstracted from
our SSI's correspond to "invariants" or "constraints" in the
environment. The environment stamps our DSI's with patterns
containing these relations, and our mind abstracts them as its
properties. Thus all natural laws are constraints discovered
among our I-sets. More formally, they are homomorphisms.
Homomorphisms were considered purely mathematical
concepts until they were generalized by cyberneticists.
Homomorphisms are relations between structures. They are discovered
when we observe that one structure (or system, which may be real
or purely mental) is similar to a simpler structure, but only if we
"map" the more complex structure onto the simpler one in a certain
two-step way. The first step is to simplify the more complex
structure by "merging" or ignoring some of its complexity, i.e.,
suppressing some of its structural variety. We need not actually
lose the discrimination of the variety. We merely ignore it for
purposes of simplification. Merging or ignoring means to treat
as if equivalent. Since variety consists in discriminable differences,
destroying variety means reducing it to a state of non-discrimination--
to equivalence. The more complex structure now has fewer "parts".

The second step is to map the reduced complex structure onto the
originally simpler structure, by a 1-1 correspondence. If this is
possible, the simplified structure corresponds, part for part and
structural relation for relation, to the parts and relations of the
second, originally simpler structure. The two structures have become
isomorphic. They are behaviourally indistinguishable. Isomorphism
does not mean that the two sets of parts and relations are identical.
It means that, viewed abstractly, they behave or function alike. For
example, an electric circuit or a computer can be made to function
isomorphically with a mechanical system or machine. They undergo
changes synchronously, but the nature of their responses is electric
and electromechanical, and mechanical, respectively. Likewise, the
abstract concepts into which the brain analyses or decomposes DSI's
consist in simplified patterns, homomorphisms abstracted from the more
complex sets of SSI's, or I-sets. We experience the abstract patterns
as properties of a DSI, or of any object observed; or in fact, of any
mental object to which we can ascribe properties. As an example, there
is a homomorphism between the complex concept "spoken and written
language without metrical structure" and the simplified concept
"prose"; and an isomorphism between the discovery that one has been
speaking prose for more than forty years, and the discovery that the
concepts of prose (and poetry) are homomorphisms. An abstraction is
a homomorphism--a simplification of a more complex structure, and a
1-1 remapping onto a simpler one. This process destroys some of the
variety in the original structure, but conserves some of its internal
relations. Symbols associated conditionally with the homomorphisms
tend to reflect some of the conceptual homomorphic
structure, but because of the arbitrary nature of the symbols and
the partly arbitrary nature of their connecting syntax, we should not
expect to find linguistic structures accurately or uniquely mapped
on conceptual structures. The same conceptual mapping may be performed
by different symbols in the same language (the dictionary phenomenon),
and in different languages. In each mapping the final isomorphism is
identical with some simplified concept. But are the differently derived
isomorphisms equivalent, even in the same language? We must expect some
difference, and where it is great enough, conscious discrimination in
the form of "near synonyms". As between languages, the question of whether
a unique isomorphism of conceptual structure can ever be communicated
(that is, so as to be independent of language) concerns the linguistic
study of translation, the attempt to preserve conceptual invariance
in remapping with more widely different symbol sets.

The basic assumption underlying the above ideas is that the
brain, starting with sensory DSI's, abstracts from them qualities,
characteristics, features, relations, abstractions, abstract concepts,
or categories and classes. It is this last aspect that we pursue.
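
The two-step mapping described above (merging, then a 1-1
correspondence) can be sketched on a toy example. The sketch assumes a
"structure" is nothing more than a set of states with one transition
each; the function names, the four-state cycle and the two-state
flip-flop are hypothetical choices made for illustration only.

```python
def merge(transitions, classes):
    """Step 1: treat states in the same class as if equivalent.

    Returns the reduced transition map, or None when the merging would
    blur a distinction that the transitions depend on.
    """
    reduced = {}
    for state, nxt in transitions.items():
        src, dst = classes[state], classes[nxt]
        if reduced.setdefault(src, dst) != dst:
            return None
    return reduced


def is_isomorphic(reduced, simple, mapping):
    """Step 2: check a 1-1 correspondence that preserves every transition."""
    if len(set(mapping.values())) != len(mapping):
        return False  # not one-to-one
    if set(mapping) != set(reduced) or set(mapping.values()) != set(simple):
        return False  # some part has no counterpart
    return all(mapping[dst] == simple[mapping[src]]
               for src, dst in reduced.items())


# More complex structure: a four-state cycle (assumed example).
cycle = {0: 1, 1: 2, 2: 3, 3: 0}
# Merging rule: ignore everything except parity.
parity = {0: "even", 1: "odd", 2: "even", 3: "odd"}
# Originally simpler structure: a two-state flip-flop.
flip_flop = {"on": "off", "off": "on"}

reduced = merge(cycle, parity)
print(reduced)  # {'even': 'odd', 'odd': 'even'}
print(is_isomorphic(reduced, flip_flop, {"even": "on", "odd": "off"}))  # True
```

Not every merging works: grouping the states as {0, 1} and {2, 3}
makes `merge` return `None`, because the merged states no longer
behave alike. This matches the text's point that the similarity is
found only when the complex structure is simplified "in a certain way".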

According to Piaget, understanding arises as the child develops
the categories or classes into which experience is analysed. The
same conclusion was reached by Bruner, Goodnow and Austin:

Much of our commerce with the environment involves
dealing with classes of things rather than with unique
events and objects. Indeed, the case may be made that
all cognitive activity depends upon a prior placing of
events in terms of their category relationship. A
category is, simply, a range of discriminably different
events that are treated "as if" equivalent.

The repetition of these ideas is more apparent than real.
The aspect that is new is that properties also determine classes.
Thus "features" are decision rules, "intension" in logic, that
"classify" DSI's, that is, determine their membership in abstract
classes. The number of DSI's that conform to such a rule is
the class "extension". This inseparable mixture of qualitative
property, which constitutes a means for deciding "like or unlike"
in property or pattern, and quantitative property, which designates
the number of instances, particulars, cases, or members, occurs
in all abstract thinking. Every concept, no matter how abstract,
has some defining intension or decision device, and a membership
or extension. The DSI has the maximum intension we can crowd
into one experience and, of course, an extension of one. That
it is also unique, is another aspect of unit extension. Quoting 5 

Restated, the theory is that the qualitative property 
or properties of a set of associated DSI's which constitutes
an abstraction from the set is the intension that defines 
a class. And the extension of the class is the number of 
DSI's associated through the abstraction class. This may 
be a rather large number, but is always finite. 
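
The pairing of intension and extension restated above can be shown in a
few lines. This is a minimal sketch, assuming impressions are small
records and an intension is a yes/no decision rule applied to them; the
sample records and rule names are hypothetical.

```python
def extension(intension, impressions):
    """Members picked out by the decision rule `intension`."""
    return [d for d in impressions if intension(d)]


# Assumed sample impressions (DSI's reduced to two recorded properties).
impressions = [
    {"color": "green", "kind": "leaf"},
    {"color": "green", "kind": "apple"},
    {"color": "red", "kind": "apple"},
]


def is_green_apple(d):
    return d["color"] == "green" and d["kind"] == "apple"


def is_green(d):
    return d["color"] == "green"


def has_color(d):
    return "color" in d


# From most defining intension to least: the extension can only grow.
rules = (is_green_apple, is_green, has_color)
counts = [len(extension(rule, impressions)) for rule in rules]
print(counts)  # [1, 2, 3]
```

The extension is always finite here because the collection of
impressions is finite, in line with the restatement above.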

There are a great many mental classes, some of which
overlap in extension through interlinked DSI's. Many more
overlap through interlinked abstractions. The number of
distinguishable characteristics or "units" of intension (variety)
abstracted from a DSI or set of DSI's varies greatly. We
experience this as abstractions more "concrete" (more like
DSI's) and more "abstract" (more removed from DSI's).
"Image" is more abstract than "color"; "color" is more
abstract than "green". This relation of ascending abstraction
can be demonstrated in classes that have a special
relation--class inclusion. They exhibit what are sometimes
called "levels" of abstraction. Consider the sequence
shown in Table 1. It is not suggested that the differences
in intension shown in the table are the minimal steps.
Introspection frequently shows intermediate steps where at
first there appear none. The differences or increments in
intension are suggestive only, with the exception of the
first two. These are fairly certain. As the level of
abstraction increases, the amount of detail or "pictorial"
content decreases. Between levels 0 and 1 there is a large
decrease, for even a "photographic" memory cannot retain the
detail in a DSI. As the level of abstraction increases, the
minimum number of DSI's required increases. "Minimal" is
important, for at each level above the second, the probable
number of associated DSI's is greater than the minimum; at
higher levels, much greater. As we go up, the minimum
extension increases as $2^{k-1}$, where k is the level (k = 1, 2,
..., N), 0 is the DSI level, and N is the highest level of
abstraction derived from a given DSI or set of DSI's. More
precisely, if we express the minimum extension in units
of DSI's (which may be considered as roughly equal units
of maximum intension, or as large units of sensory "bits")
we have

$$I = I_0 - I_k = \text{intension}$$

$$E = 2^{k-1} = \text{extension} \quad (k = 1, 2, \ldots, N),$$

where $I_0$ is the maximum intension, namely the intension in
the unit, or DSI. This relation involving a decrease from
maximum intension, and growth of minimum extension, is
suggested in Fig. 4.

Fig. 4: Suggestive Model of Levels of Abstraction

The significance of such a structure can be interpreted
cybernetically. An individual DSI is unique. Yet the DSI, the
only object that effects communication, cannot be experienced by
two persons, nor repeated by the same person. Unique experiences
give no basis for sharing experience; if DSI's and SSI's
were our only mental contents nothing would be repetitive, nothing
similar to any other thing. We could not compare experiences. Ashby
states: "The most fundamental concept of cybernetics is that of
'difference', either that two things are recognizably different or
that one thing has changed with time"^7. True, but psychologically,
"equivalence" is even more fundamental. Just as one cannot discover
invariance without change, one cannot experience change itself unless
something remains constant. Therefore, starting with DSI's and
SSI's only, we should possess so much variety that we could not
experience or conceive of "difference". The significance of the new structure
is that it destroys enough variety so that differences can be observed.
The senses supply variety but not observed differences. Constancy
is necessary for becoming aware of differences. Sufficient "prerequisite
constancy" is necessary for use of "prerequisite variety". What is
sufficient is discussed in the next section.

Assuming that our understanding and knowledge (e.g., of natural
laws) arise through constraints on sensory variety and that the brain
creates these constraints (or constancy) as a bio-psychological artifact,
homomorphic transformation becomes the chief higher mental process. In
homomorphic transformation some but not all variety in a complex structure
is suppressed or merged. This does not mean that it is lost. The SSI's
can be recalled intact from memory. Their variety then must be suppressed
in some added, semi-independent structure. In this structure is replicated
(perhaps by a matching reminiscent of the transfer role of RNA relative
to DNA) not all the variety in the SSI's but the set of reinforced partial
patterns. The new auxiliary structure must simultaneously act as a
semi-independent "equivalence class", and yet remain connected to the individual
memory traces (SSI's) so that if one of them is activated (as in sudden
recall of a past scene) the interpretive abstract classes are also activated.
The whole functions according to the law of requisite variety. The
"disturbance" D is the variety in the SSI's. They and their natural order
of occurrence are the environment T. The auxiliary (abstract)
structures are the "regulators" R constructed and interposed by the body
between D and the essential variables E, among which are our
survival-oriented, "intelligent" responses. The set of abstractions is a superb
cybernetic device. It creates both the needed invariance to support
observation of differences, on which depend internal communication or
thought, and "sufficient similarity" between minds for external
communication (see next section). It is a masterpiece of evolutionary
achievement and simplicity. All the effects are accomplished in one
economical process--erasing sensory differences. The cognitive
process can be visualized as two main cybernetic steps: data gathering,
or production of variety through sensors; and data processing, or homomorphic
simplification through natural associative pattern filters--reduction
of variety into abstractions. Each process and supporting brain structure
requires the other. Together they constitute the basic devices of cognition.

V Requisite Constancy

We have conjectured the existence of structures in the brain
(call them A-sets) derived from naturally associated sets of stored
sense impressions (I-sets). Each member of an A-set has the nature of
a relation, an equivalence class for its member SSI's and a qualitative
"property" of the I-sets. It is also the intension which defines a
class. For the special case of an A-set based on a tree-like I-set we
suggested a simple model, giving the minimum number of
naturally-associated SSI's required to form the i-th "level":

$$n(i) = 2^{i-1} \quad (i = 1, 2, \ldots, m),$$

where m is the highest level constructed within the brain on that I-set
as base. Then the total number of naturally associated SSI's would be
at least

$$N(m) = \sum_{i=1}^{m} n(i) = 1 + 2 + 4 + 8 + \cdots + 2^{m-1} = 2^m - 1.$$
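
As a quick numerical check of the sum above, the following sketch
tabulates the minimum extension per level and the running total. It
assumes the doubling rule n(i) = 2^(i-1) of the simplified tree-like
model; the function names are introduced here only for illustration.

```python
def n(i):
    """Minimum number of naturally associated SSI's needed to form level i (i >= 1)."""
    return 2 ** (i - 1)


def N(m):
    """Minimum total number of SSI's underlying levels 1 through m."""
    return sum(n(i) for i in range(1, m + 1))


for m in range(1, 11):
    # Closed form of the geometric sum.
    assert N(m) == 2 ** m - 1
    # Each level needs one more SSI than all lower levels combined.
    assert n(m) == N(m - 1) + 1

print([n(i) for i in range(1, 6)])  # [1, 2, 4, 8, 16]
print([N(m) for m in range(1, 6)])  # [1, 3, 7, 15, 31]
```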

In this model each successive level contains just one more
naturally-associated SSI than the entire sum of all those in lower
levels. What interpretation can be placed on this rapid increase
in minimum class extension? Again there is a straightforward
cybernetic interpretation: the levels represent decreasing variety
or class intension (or partial-patterns) in the set of SSI's which
define the class. A decrease in variety may be regarded in several
ways. Most obviously, it represents a lessening of constraints, so
that more objects can be found that comply with it as a defining
rule for a class, i.e., class extension increases. Another aspect
is that, if the range in variety used to define a field is narrowed,
the range in the corresponding concepts is narrowed, or the concept
stability increases. That is, concept stability would be some inverse
function of range in permissible variety. But class extension is
such an inverse function (although not necessarily the correct
function). Hence we may regard the increasing extension and
decreasing intension in the abstractions as a measure (of some sort)
of increasing conceptual stability. Finally, there is another aspect
of the same phenomenon. The decrease in class intension also
corresponds to increase in versatility of response behaviour, on the part
of the person whose brain is involved. There are innumerable
examples. For instance, when John is five, the question "how many
are two cows and three horses?" is no poser, for he does not see
the difficulty. As he matures, he senses that cows and horses
differ and cannot be added. Having developed still more levels,
he calmly uses the upper level, and answers that "two domestic animals
and three domestic animals are five domestic animals." Thus he has
acquired additivity by use of a higher, more abstract level. The
physicist does the same when he insists on "dimensional homogeneity"
in the terms of his equations. The mathematician strips all the
internal structure of intension from his mental objects, leaving
only their externally discriminable structure--their number and
order. He thus creates the integers, and thereby gains still more
versatility. Greater generality is equivalent to less defining
intension of a class, to less variety. And the rule is that, the
more general the class, the more "particular" cases it subsumes, and
the more versatile is the mental owner in performing mental operations.

Now let us look in the opposite direction. At the bottom
level is the SSI. It contains the maximum variety we can experience
in one observation. It has the least versatility. Using only the SSI
we cannot communicate internally in thought, or externally, in a
message. If we look a few levels higher, we still cannot communicate.
For example, we cannot communicate the level 2 in Figure 4. The
uniqueness is still too great, i.e., there exist no comparable structures
in the mind of another person. However, at a certain level the stability
in conceptual structure in two minds begins to be sufficiently similar
so that the indication of the concept structure by one person finds
something similar which can respond in the other person. When this
hypothesized threshold of stability for communication is reached,
there is an enormous simultaneous increase in versatility. For
now the two persons can function as a social unit, mutually assisting
each other's goals. This hypothetical level, c, we indicate by n(c),
the minimum number of naturally-associated SSI's for the threshold
conceptual stability necessary for interpersonal communication.
Evidence for the existence of such a threshold is found in the facts that
we cannot communicate DSI's and SSI's, and that we can and do
communicate by signs or symbols that evoke abstract classes. For levels higher
than c, communication becomes easier and easier. The probability that
our conditioned associations (symbols) evoke similar concepts (if they
are present) becomes better and better. This idea underlies the ease
of communication by small groups of professionals who share the same
sets of abstractions. The precision of mathematical concepts, for
example, is no accident. The fact that they can be (not necessarily
always are) so precise is attributable to their enormous suppression
of variety. They are actually derived from very large I-sets of which
the mathematician loses awareness. In fact the difference between
mathematics and other sciences lies in a kind of superversatility--
the mathematical concept structure is not necessarily reconstructed
so as to embody the constraints of patterns of variety observed in
the real world of DSI's. Yet one of these constraints persists in
a way that the mathematician must observe. He uses it as his link
with "empirical reality". The constraint on his synthetic patterns
is that of validity of proof. Proofs allow enormous simplification
(a homomorphic device) since they permit suppressing the variety in
lower levels of abstraction, and retaining only the reduced patterns
at the higher levels. A proof is a rule for interlevel transitions,
"inference", in logic. The example below applies only to levels of
abstraction based on tree-like I-sets, as in the simplified model:

Hypothesis 1: All A is B  (If A then B)
Hypothesis 2: All B is C  (If B then C)

Conclusion:   All A is C  (If A then C)
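
One way to see why the middle term can be dropped is to read "All A is
B" as class inclusion between extensions. The sketch below makes that
assumption explicit (it is an interpretation, not the paper's own
formalism) and checks the syllogism exhaustively over every triple of
subsets of a small, hypothetical universe.

```python
from itertools import combinations, product


def subsets(universe):
    """All subsets of a small finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def syllogism_holds(universe):
    """True if (A within B) and (B within C) always gives (A within C)."""
    all_sets = subsets(universe)
    return all(a <= c
               for a, b, c in product(all_sets, repeat=3)
               if a <= b and b <= c)


# Assumed three-member universe: 8 subsets, 512 triples to check.
print(syllogism_holds({"x", "y", "z"}))  # True
```

Once the check has passed, only the pair (A, C) needs to be retained,
which is the reduction in variety the example points to.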

Assuming the mathematician can demonstrate the truth of the
conjunction of the hypotheses, then a valid conclusion follows, and
the middle term B is unnecessary, as in fact is A, since the whole
process is designed to show that A also conforms to rule C. This
type of constraint, empirical in origin, has the versatility of
applying to any three levels. The example merely suggests that logic
too is a device for reducing variety, for regulating and controlling
it, i.e., for increasing the versatility of its user.

Let us now see how these mental structures in IS path segments
ab and cd determine physical access in segment bc. Intellectual access
occurs, of course, only in segment cd.

VI Access to Knowledge As Control over Variety 

To summarize our position: knowledge does not exist in the recorded
literature, nor in any symbols for classes. Knowledge
exists only at the terminals of the IS path, in the form of abstract
classes organized from and connected with stored sense impressions.
The literature through which access is obtained exists in collections
of LD messages containing symbols. Therefore, in discussing access to
information and knowledge what we really mean is access to arbitrary
patterns, located externally, of themselves without meaning. By prior
conditioned association, however, they can, when sensed, interact with
and evoke into awareness our internal, naturally associated patterns.
The new internal superpattern temporarily reconstituted in awareness
can, through its texture of choice and order, convey information and
add knowledge. The abstract mental pattern is the basic form of reduced
variety. The greater its stability the surer the contact between the
external symbol and the delicate structures of understanding which
abide only within the living organism.

The problem of access to knowledge can be analysed in various
ways. One of the most systematic would be to take advantage of the
structural organization provided by the IS path, and consider it by
segments and stages within segments. Because the propagation
of a message along this path is directed, a directed-graph
representation might be useful. We should regard each stage as a channel,
limited in its type and capacity to transmit quantity of variety per
second. A model has been based on the rate at interface c (rate of
sensing symbols, on the part of a human or a machine). This model
showed that limited channel rates for sensing variety have shaped all
IS systems:

"due to the accumulation of LD messages, any
inquiry takes place among vastly more messages than
can be sensed in the available time. Out of this
constraint on use of stored information arose, we
suggest, the main framework of the system of homomorphic
transformations which underlie "bibliographic access"
to stored literature."



Let V be the average sensing rate for units of variety (e.g.,
words/second). Let T be the number of seconds during which sensing
occurs. Then

$$VT = M$$

is the total number of words sensed during T, or the "message length",
measured in words. This quantity is numerically equal to some fraction
of the collection sensed:

$$VT = M = \frac{N}{CK} = \frac{nA}{CK}$$



where N is the total number of units of variety (words) in the collection,
n the number of major units of variety (e.g., documents or volumes), A
the average number of words per major unit (words/document, or
words/volume), K is the average homomorphic compression factor, and C the
initial selection factor. Since the main constraints are in the small
magnitudes of V and T (that is, we can sense only a short message within
any reasonable time), we organize our searches around the size of the
collection N, the compression K (K = number of words in original message
divided by number of words in compressed message,
a ratio, or pure number without dimensions), and the power of the
classifying system to divide the collection, i.e., to eliminate all but a fraction
N/C of the original collection (K=1), or N/(CK) of a compressed or surrogate
collection. C is also dimensionless. The searches take place on the
compressed collection, and their objective is not to sense the literature
but to decide what literature to sense. Without going into details
we may say that the ranges of C and K are the same:

$$1 \le C \le N, \qquad 1 \le K \le N$$

and that the types of searches depend on the number of prior-prepared
compressions (values of K), and structural organization which permits
selection (dividing the collection by C). In all cases except that
of direct sensing of an original collection (K=1), the search starts
at the highest value of K. N/K is the total compressed collection.
If access is, for example, through catalog cards only, then the
collection is compressed unit by unit (by volume), and K would be

$$K = \frac{\text{number of words per volume}}{\text{number of words per card}}$$

For example, if A = 100,000 words/volume, and the average
number of words per catalog card were 100, then

$$K = 100,000/100 = 1000$$
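
The catalog-card figures above can be carried one step further to see
what they imply for sensing time. In the sketch below the compression
factor uses the paper's numbers (100,000 words per volume, 100 words
per card); the collection size of 10,000 volumes, the sensing rate of 5
words per second and the selection factor of 1,000 are assumptions
chosen only to make the arithmetic visible.

```python
def compression_factor(words_per_volume, words_per_card):
    """K: words in the original unit per word in its surrogate."""
    return words_per_volume / words_per_card


def message_length(n_volumes, words_per_volume, C, K):
    """M = nA / (CK): words that must actually be sensed."""
    return n_volumes * words_per_volume / (C * K)


def sensing_time(M, V):
    """T = M / V, from VT = M."""
    return M / V


K = compression_factor(100_000, 100)           # figures from the text
n_volumes = 10_000                             # assumed collection size
V = 5                                          # assumed words sensed per second

whole_catalog = message_length(n_volumes, 100_000, C=1, K=K)
after_selection = message_length(n_volumes, 100_000, C=1_000, K=K)

print(K)                                 # 1000.0
print(whole_catalog, sensing_time(whole_catalog, V))      # 1000000.0 200000.0
print(after_selection, sensing_time(after_selection, V))  # 1000.0 200.0
```

Under these assumed values, compression alone still leaves a million
words of cards to sense; it is the combination with selection that
brings the message down to a length readable in a few minutes.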

For such a system the sensing takes place on the "image" or "surrogate"
collection of cards. At the start of the selection C = 1 (nothing has
been selected). After one word had been eliminated, the value of C
would be

$$C = \frac{\text{original number of words}}{\text{remaining number of words}} = \frac{N/K}{(N-1)/K} = \frac{N}{N-1}$$

In other words, during search C increases from 1 to N (the end of the
search, when the last word is about to be sensed), while K has one or
more constant values, which decrease until K = 1. In all cases the
manipulation of K and C follows a certain order. Since the act of
selecting requires variety of more than 1, the values of K must be
reduced before the value of C can be increased. For example, suppose
a system is used (somewhat like the Dewey Decimal System) in which the
collection is first divided into ten parts. If the entire collection
were first represented by one word, say the word "Collection", then the
value of K would be, initially,

$$K_0 = \frac{N}{1} = N$$

and no selection could be made from the one word. If, however, K were
reduced so that (decreasing K by a factor of ten)

$$K_1 = \frac{N}{10}$$

then it would be possible to select one or more of the ten words. Thus
the variety in the compressed collection had to be increased from 1 to
10 before the selection from a variety of 10 could be made. In any
decision process, there must be a variety of at least 2. In information
search with a classification system we alternate decompression (the
inverse of the compression process that took place when the more
abstract class was formed) with selection as we go from the abstract
to the concrete,
always keeping the amount of variety to be handled at each stage
small (and therefore the sensed message M short). In this way the
total sensed message is

$$M = VT = VT_1 + VT_2 + \cdots + VT_m = M_1 + M_2 + \cdots + M_m$$

where m corresponds to the lowest value of K greater than 1.
At K=1 the nature of the search changes from decision-making
(selecting what to sense) performed on the homomorphically-mapped
image-collection, to sensing the selected part N/C of
the original collection. Advantage can be taken of this change in
function, as described below. The successive reductions in K permit
the selection process to take place rapidly (since at each stage
j only a few words are sensed) and still preserve exhaustive
coverage, since the homomorphic compression represents the whole
remaining collection at each stage. Figure 5 shows schematically
the inverse changes in K and C, with those in K preceding those
in C. Note that after K = 1, the rate of climb of C is much slower,
since the "reduction" of the collection now occurs only at the ordinary
reading rate V. The entire time of search can be symbolized by

$$VT = VT(K>1) + VT(K=1) = M(K>1) + M(K=1) = M$$

in which the lengths of M and T are controlled by regulating the amount
of variety sensed. The analysis also suggests that the two major
cognitive variables in access to knowledge are contributed by two
types of homomorphic mapping: selection, in which the part that remains
is isomorphic with the original part; and compression, in which the
new structure is homomorphic with the entire original structure. We
are thus led to the generalization that search for information in an
LD collection consists in strategic use of prior-prepared homomorphisms,
combined with instantaneous isomorphic selections. It suggests that
there is an approximate constant:

$$KC = N/VT$$

or rather, a parameter N/VT determined jointly by the size N of the 
collection and VT, the message length convenient for the speed and 
time available to the sensor. This hyperbolic relation between K 
and C holds, of course, only over the region in which both vary, 
namely, the region of decision-making or selection, K>1. Since C 
can vary only with change in K, it further suggests that designers 
of future information retrieval systems, or planners of search 
strategies, should consider the number and values of built-in K levels 
as prime factors in design. K and C respectively reflect the two 
suggested basic mental processes: K, homomorphic simplification and 
suppression of variety; C, isomorphic sensing of variety, coupled 
with its elimination by selection. K and C can both be regarded as 
homomorphic mapping processes, K by compression into a simpler structure 
with retention of certain invariants (the meaning), C by mapping an 
original structure into a binary function (1,3). The result of the 
latter is to select parts of the original without otherwise altering 
them. 
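
The hyperbolic relation KC = N/VT can be illustrated with a small 
numerical sketch. It assumes that N and VT are measured in the same 
unit (say, words), that the relation is read as a constraint holding 
the sensed message at each stage near VT, and that K is floored at 1, 
the point at which compression ends and direct sensing begins. The 
function names and the figures are illustrative assumptions, not part 
of the model as stated. 

```python
def compression_needed(n_items: float, selection_c: float, message_len: float) -> float:
    """Solve KC = N/(VT) for K, given N, C and the convenient message length VT.

    The result is floored at 1.0, since K = 1 means "no compression":
    the searcher is then sensing the selected part N/C directly.
    """
    if n_items <= 0 or selection_c <= 0 or message_len <= 0:
        raise ValueError("N, C and VT must all be positive")
    return max(1.0, n_items / (selection_c * message_len))


def search_schedule(n_items: float, message_len: float, c_values: list[float]) -> list[tuple[float, float]]:
    """Pair each selection level C with the compression K that the relation requires."""
    return [(c, compression_needed(n_items, c, message_len)) for c in c_values]


if __name__ == "__main__":
    # Hypothetical collection of one million words, sensed 1000 words at a time.
    for c, k in search_schedule(1_000_000, 1000, [1, 10, 100, 1000]):
        print(f"C = {c:>5}  K = {k:>7.1f}  KC = {k * c:.0f}")
```

Under these assumptions each tenfold increase in C is matched by a 
tenfold decrease in K until K reaches 1, after which further reduction 
of the collection can proceed only by reading. 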
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Another suggestion for improving access arises from this 
model. Machines can now recognize patterns of variety that humans 
cannot; and for patterns which both can recognize, machines can be 
more rapid and reliable. A "trade" is possible between a wide range 
in variety of patterns recognized by humans (for example, in reading 
various kinds of handwriting and printed fonts) at slow speeds, and 
a much narrower range in variety necessary to make the machine reliable. 
The machine requires a degree of stability of symbolic pattern much 
higher than does the human. If this cost is paid, however, machines 
can sense at much higher rates than can humans. Since they must 
extract features from the patterns in order to recognize them, this 
"classification" by features also permits feature-class coordination. 
Thus machines can "make decisions" and "select", without understanding 
the meanings of the classes coordinated. This is one advantage which 
can be taken from the separation of the functions. The machine 
relieves the human of the work of sensing VT(K>1), and leaves him only 
that of sensing VT(K=1), where the meanings are required. 

An extreme saving of another kind can occur if the machine 
pattern recognition speed is great enough. Suppose K=C=1, that is, 
there is no compression, and no selection. Then a pattern recognition 
machine that had a speed 

V = N/T 

could entirely dispense with "classification". The decision-making 
preliminary phase would be eliminated, and the entire collection N 
could be "unstructured" yet searched in acceptable time T. This 
solution would not, of course, allow a human to derive meaning from the 
sensed collection. It would, however, allow the human to select from 
and locate within the collection any pattern or combination of patterns. 
For certain purposes the unstructured collection might serve as well 
as the classified collection. Prior to the age of machine pattern 
recognition humans had no choice but to classify collections. Thus 
the reduction in variety represented by the resulting machine capability 
gives increased versatility to the human "regulator" of the variety. 
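
As a rough worked example of the limiting case K=C=1, the following 
sketch computes the sensing speed V = N/T that an unstructured scan 
would require, and compares it with the time a reader would need at 
an assumed reading rate. The collection size, the acceptable time and 
the human reading rate are hypothetical figures chosen only for 
illustration. 

```python
def required_speed(n_items: float, time_allowed: float) -> float:
    """Speed V = N/T needed to sense the whole collection N within time T."""
    if n_items < 0 or time_allowed <= 0:
        raise ValueError("N must be non-negative and T positive")
    return n_items / time_allowed


def scan_time(n_items: float, speed: float) -> float:
    """Time T = N/V needed to sense the whole collection N at speed V."""
    if n_items < 0 or speed <= 0:
        raise ValueError("N must be non-negative and V positive")
    return n_items / speed


if __name__ == "__main__":
    n_words = 10**9          # hypothetical collection size, in words
    acceptable_t = 100       # hypothetical acceptable search time, in seconds
    human_rate = 5           # assumed human reading rate, in words per second

    print("machine speed needed (words/s):", required_speed(n_words, acceptable_t))
    print("human scan time (s):", scan_time(n_words, human_rate))
```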

Compression of the range in variety underlies all physical 
compatibility of real systems of machines. Dramatic examples occur 
when, for example, the gauge of a railroad track is standardized, or 
the dimensions of camera films and microforms are agreed upon by the 
makers. They permit interchangeable use of railway cars on different 
lines, interchangeable cameras and developing equipment, uniform microform 
storage and viewing equipment. Mass production begins by compression 
of variety to the point at which large numbers of artifacts form 
"equivalence classes" defined by common mechanical, electric and other 
tolerances. Failure to reduce variety, on the other hand, reduces 
versatility, blocks the integration of systems and networks, and 
keeps them isolated and small in class extension. In planning 
"global" classification systems for knowledge, the not-so-hidden 
adversary is again the range in uncontrolled variety. The way 
to control is "standardization" of all uses of symbols in 
the same language, between languages and most fundamentally 
(because hardest to do) in the conceptual classes with which the 
symbols are conditionally associated. Failure of "machine" translation 
is attributable to this last. Conceptual patterns formed 
from DSI's experienced at one latitude of the globe, and one time 
in history, differ from those abstracted from different "geotemporal" 
locations. With systematic differences in conceptual pattern, and 
accidental differences in syntactical structure, not only are exact 
word-for-word or sentence-for-sentence translations of natural 
languages impossible, but there must remain fundamental variety in 
conceptual pattern, equivalent to variety in meaning. This suggests 
a reorientation in such efforts, the prime objective being to determine 
the qualitative differences in concept patterns between language pairs, 
and the resulting limits on the "equivalence classes" that can or cannot 
be overcome. It is scarcely accidental that computers which could at 
first respond only to "machine" languages were incompatible. Computers 
gained compatibility in proportion to their capacity for "higher", 
more abstract languages, more flexible and versatile. Knowledge 
as a set of interrelated abstractions is most stable where most abstract. 
But any class at any level should be recognized for what it is -- 
an organic artifact evolving, adapting, growing. The stability of 
a class is "attacked" (disturbed) by new insights at all levels but 
especially at the lowest levels with unit extensions, SSI's. Greater 
intellectual and instrumental capability to discriminate is the motive 
power of science. But it produces increased variety in the form of 
interstitial interpolations of new classes. Thus any precoordinated 
system for describing knowledge crumbles, most rapidly at the bottom 
(close to the DSI's of new experiment) and less at the top. But no 
classes are permanently fixed. 

In this light the stability of any particular subject area 
depends on several factors. One is the kinds of data (DSI's and SSI's) 
from which the class structures are abstracted. In the physical 
sciences, for example, there are very long, "repeatable" sets of data 
(SSI's). The physical sciences are those in which, by and large, 
variety is most easily suppressed, for they lack the enormous variety 
of living things. It is logical that man's ascendency over nature 
through the relations of science took place first in the physical 
sciences. Even here, concepts based on classes were highly unstable 
until certain higher levels were achieved, such as the insights of 
Copernicus, Kepler, Newton, and many others. Then with an intellectual 
"floor" of constraints on the variety of physical objects and force 
fields, the more complex structures of organisms and societies of 
organisms could be approached. Most resistant are those extremes of 
complexity which the cyberneticist calls the "very large system" -- 
large in terms of overwhelming variety, too large for more than broad, 
statistical homomorphisms. Conspicuous among sciences of large systems 
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are the social and behavioural sciences. Here the lengths of sequences 
of SSI's that constitute the basic data are still short, and the 
stability of classes correspondingly lower. Searchers for subjects in 
these areas experience this instability as a wider variety in nomenclature, 
different names for the same concept, or more confusing, the same names 
for slightly differing concepts. They look with envy on the conceptual 
stability of certain areas of the physical sciences. They should not 
be disturbed. They have to do with smaller class extensions, where, 
of course, the extension of a concept is always the finite number of 
SSI's that are naturally associated in the class. As the experiences 
of the field grow, the increasing extensions of these classes will 
push toward higher stability through the emergence of broader 
abstractions. These will apply more "universally", i.e., appear to be stable 
for more cases that are tested against their subjective thresholds of 
perception of differences in variety. 

The same relative instability of class structure should 
occur in a field in which some kind of uniqueness is inherent in the 
subject. This is outstandingly the case for artifacts such as paintings, 
sculpture, or architectural buildings. These are unique, as are the 
critics' evaluations of them. There is a built-in uniqueness in a 
literary, artistic or musical work in which, in spite of LD 
multiplication, the value of the work resides in the individual selection 
of the author, from the common store of variety such as the words of 
a language, the positions and colors and strokes of paint on a canvas, 
the spacing of musical notes. Here again the secondary literature 
about the primary work, the criticism, abstract, or citation, has 
much irreducible variety which blocks suppression. 

Faced, then, with improving access to the literature of the 
social and behavioural sciences and the arts and humanities, of which 
the common feature is some aspect of uniqueness, small classes, and 
wide variety, what is the general direction that theory indicates 
our efforts should take? The answer should be clear, if we assume 
the ideas set forth above. We must consciously search out and 
suppress variety. Who will do this? The actual reduction of 
variety will be an immense task performed by thousands of workers. 
What devices will they use? Many are familiar, and have already 
been discussed: standardisation of nomenclature, of indexing methods, 
of formats; establishment of intersystem compatibility; and special 
search strategies. One general result of practical experience should 
be stressed: to suppress variety has a cost. There is an inevitable 
"trade", a quid pro quo in every solution. 

The theoretical pattern is simple. Because regulation 
involves variety and especially the flow of variety, and its flow 
anywhere on the IS path involves the same general elements, solving 
problems of access to knowledge involves the same general types of 
difficulty, the same kinds of remedy, the same general needs for exchange. 
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us close by pointing out that, as so often occurs, 
practice has preceded theory. Many or most of the devices mentioned 
above have been known, tried, used. Reasons for use resembled those 
suggested by the present hypothesis. An outstanding example is the 
class of devices called thesaurus. Thesauri play essential roles in 
larger information search systems, especially in automated ones. While 
their greatest application has been in "man-machine interfaces", they 
are also used in "machine-machine interfaces" under other names such 
as "adapters" or "higher languages". The essential situation is that 
a large variety on one side of an interface cannot be tolerated on the 
other. In the case of large populations of users of an information 
system, the approach to the system is through "questions" which are 
structures of natural languages. All symbols, whether in "controlled" 
or "natural" languages, have been called here "conditionally" associated 
(consistent with Pavlov's discovery that any random stimulus pattern 
can, if suitably reinforced, be firmly associated with a "naturally" 
associated pattern). Naturally associated patterns range from 
automatic reflexes to higher concepts; in all of them systematic constraints 
are jointly imposed by body and environment. In natural language 
there are many sources of variety. One is the large number of synonyms 
and near-synonyms for the same or closely similar concepts (source 
of the original Roget). Another is the variety of codes for the 
same symbol (spellings), and in suffixes and prefixes to word stems. 
A third is the semantic dislocation by homonyms, use of the same 
symbols for different concepts. All these and others can be present when 
access is by single or short compound terms. The variety is so great 
that an information system, particularly a machine in which access may 
be through a few codes or a single code, cannot respond without large 
variety, e.g., low "recall" or low "precision", or both. 

A natural language thesaurus is used at interface c of the IS 
path to regulate the variety in segments cd of vocabulary among 
different searchers of an LD collection in their central segment bc. 
As explained, the variety regulated does not occur so much in the 
searchers' concepts (natural association) as in their access symbols 
(conditioned association). Let us simplify and assume a number p of 
symbols, all equally likely to be employed by the population of 
searchers in access to a set of stored items. Suppose only one of 
these can actuate the access mechanism. Then the probability of 
retrieval of the set of items is 1/p without the thesaurus, close to 
1 with it, if the p symbols nearly exhaust the variety in population 
access-vocabulary. Thesauri can be used with either pre- or 
postcoordination systems, but may have greater utility in the latter, since 
precoordination is equivalent to use of a syntactic ordering, which 
reduces variety. Thus a thesaurus would be particularly suitable 
in new fields, where new vocabulary is still being rapidly "coined", 
or that in use is having its "edges clipped" by free homonymity. The 
versatility of the searcher is increased, since achieving his goal 
is less dependent on his particular choice of index symbol. This 
stability is achieved by formation of a larger equivalence class. While this 
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muc : larger class extension has a slightly lower intension than 
any of its components, a single coordination, at most a few, rapidly 
restores the intension of the original concept. Thus there is enormous 
gain in system performance -- matching the man to the machine. A 
thesaurus is a "funnel" into which variety is poured haphazardly, and 
out of which flows a thin, uniform stream which can be accurately 
directed into the narrow aperture of a small-neck "bottle" (machine 
or rigidly constrained system). It is an adapter. On a more abstract 
level, a higher language that endows computers and programs with the 
versatility of inter-communication and use is analogous to a thesaurus. 
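
The "funnel" action of a thesaurus, and the change in retrieval 
probability from 1/p to nearly 1, can be sketched as follows. The 
sketch assumes the simplified case described above: p access symbols, 
all equally likely, only one of which actuates the access mechanism. 
The class name, the function name and the sample vocabulary are 
hypothetical and serve only to illustrate the equivalence-class idea. 

```python
from fractions import Fraction
from typing import Optional


class Thesaurus:
    """Maps every variant symbol of an equivalence class to one preferred term."""

    def __init__(self, equivalence_classes: dict[str, set[str]]):
        self._canonical: dict[str, str] = {}
        for preferred, variants in equivalence_classes.items():
            # The preferred term is a member of its own class.
            for term in variants | {preferred}:
                self._canonical[term.lower()] = preferred

    def normalize(self, term: str) -> Optional[str]:
        """Return the preferred term, or None if the symbol is not covered."""
        return self._canonical.get(term.lower())


def retrieval_probability(searcher_terms: list[str], access_term: str,
                          thesaurus: Optional[Thesaurus] = None) -> Fraction:
    """Fraction of equally likely searcher symbols that actuate the access mechanism."""
    hits = 0
    for term in searcher_terms:
        key = thesaurus.normalize(term) if thesaurus else term
        if key == access_term:
            hits += 1
    return Fraction(hits, len(searcher_terms))


if __name__ == "__main__":
    # Hypothetical population vocabulary of p = 4 symbols for one concept.
    terms = ["car", "automobile", "motorcar", "auto"]
    thesaurus = Thesaurus({"automobile": {"car", "motorcar", "auto"}})

    print(retrieval_probability(terms, "automobile"))             # without thesaurus
    print(retrieval_probability(terms, "automobile", thesaurus))  # with thesaurus
```

In this sketch the thesaurus covers all p symbols, so the probability 
rises from 1/p to exactly 1; a symbol outside the equivalence class 
would still fail, which corresponds to the condition that the p symbols 
nearly exhaust the variety in the population's access vocabulary. 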

The cost, or exchange, is of several sorts. In postcoordination 
the searcher reserves greater freedom of "search strategy" but pays 
through performing more coordinations in order to achieve 
exhaustiveness (recall). The thesaurus cost includes that of the extra searches 
it makes possible. By far the greatest cost, however, is the task 
of setting up the thesaurus categories. These are best undertaken 
on a discipline-wide basis -- by all who expect to use the instrument. 
This suggests that in the social sciences and the humanities perhaps 
the most logical point of departure is to set up thesauri for the 
various subfields in profession-wide efforts. This would attack 
intra-subfield variety. At the same time, to assure eventual 
interfield compatibility, the various subfields should coordinate. The 
latter would involve careful comparison of concepts, and of terms 
used for access symbols. Obviously, the larger the area of knowledge, 
the greater the diversity and the greater the work of suppressing variety 
in common concepts. Thousands of workers and many years may be 
required to achieve stability of access to knowledge in the social 
sciences and humanities. Fundamentally we are not removing variety 
from symbols so much as from ourselves in broadening the basis 
through shared knowledge. In this effort we may take satisfaction 
that cybernetics and information science are beginning to guide us. 
But we are not yet out of the woods. The light they shed is still 
feeble. It points out a direction, but not yet paths or their 
difficulty. 






References and Notes 

1. Ashby, W. Ross. An Introduction to Cybernetics. London, 
    Chapman & Hall, 1956, Chapter 4. 

2. Heilprin, L.B. "Information Storage and Retrieval as a 
    Switching System", Switching Theory in Space Technology, 
    Aiken and Main, eds., Stanford University Press, 1963, 298-332. 

3. Heilprin, L.B. and Goodman, F.L. "Analogy Between Information 
    Retrieval and Education", American Documentation 16: 
    July 1965, 163-169. 

4. Heilprin, L.B. "Toward a Definition of Information Science", 
    Automation and Scientific Communication, Proceedings 26th 
    Annual Meeting, American Documentation Institute, Chicago, 
    Ill., October 1963, 239-241. 

5. Heilprin, L.B. "Critique and Response to Paper by Jesse H. 
    Shera: An Epistemological Foundation for Library Science", 
    The Foundations of Access to Knowledge, A Symposium, 
    Montgomery, E.B., ed., Syracuse University Press, 1968, 26-35. 

6. Heilprin, L.B. and Crutchfield, S.S. "Project Lawsearch: A 
    Statistical Comparison of Coordinate and Conventional Legal 
    Indexing", Parameters of Information Science, Proceedings 
    27th Annual Meeting, American Documentation Institute, 
    Philadelphia, Pa., October 1964, 215-234. 

7. Heilprin, L.B. "On the Information Problem Ahead", American 
    Documentation 12, January 1961, 6-14. 

8. Ashby, W. Ross. Loc. cit., Chapter 11. 

9. The LD propagation mode underlies communication of both 
    cognitive and affective messages. A message can be regarded 
    as a two-component vector (C,A). The messages of science have 
    a larger C than A component, as measured by responses of the 
    recipients; while those of the arts and humanities evoke a 
    larger proportion of A to C response in the recipients. 
    Neither component is ever completely absent. 

10. Ashby, W. Ross. Loc. cit., Chapter 13, p. 247. 

11. Bridgman, P.W. The Logic of Modern Physics, Macmillan, 1932, 
    page 5. 

12. Ashby, W. Ross. Loc. cit., Chapter 6. 

13. Random House Dictionary of the English Language, College 
    edition, Random House, 1968, p. 1062. 

14. Moliere, J.B.P. Le Bourgeois Gentilhomme, 1670. Act II, Sc. 4. 

15. Piaget, Jean. Judgment and Reasoning in the Child, 1928. 
    Translated, reprinted, Littlefield, Adams, 1966. Chapters IV, V. 

16. Bruner, J.S., Goodnow, J.J., and Austin, G.A. A Study of 
    Thinking. Wiley, 1956, p. [illegible]. 

17. Ashby, W. Ross. Loc. cit., Chapter 2, p. 9. 

18. Ashby, W. Ross. Loc. cit., Chapter 7, pp. 130-4. 

19. Heilprin, L.B. "Technology and the Future of the Copyright 
    Principle", American Documentation 18, January 1968, pp. 6-11. 

20. Pavlov, I.P. Conditioned Reflexes, 1927, translated, Dover, 
    1960. 

21. Cleverdon, C.W. "The Testing of Index Language Devices", 
    ASLIB Proceedings, 15, Vol. 4, April 1963. 

22. Thanks are due Linda S. Handy for her long, patient care in 
    typing this manuscript. 



